    Constant Factor Approximation for Balanced Cut in the PIE model

    We propose and study a new semi-random semi-adversarial model for Balanced Cut, a planted model with permutation-invariant random edges (PIE). Our model is much more general than planted models considered previously. Consider a set of vertices $V$ partitioned into two clusters $L$ and $R$ of equal size. Let $G$ be an arbitrary graph on $V$ with no edges between $L$ and $R$. Let $E_{random}$ be a set of edges sampled from an arbitrary permutation-invariant distribution (a distribution that is invariant under permutation of vertices in $L$ and in $R$). Then we say that $G + E_{random}$ is a graph with permutation-invariant random edges. We present an approximation algorithm for the Balanced Cut problem that finds a balanced cut of cost $O(|E_{random}|) + n \text{polylog}(n)$ in this model. In the regime when $|E_{random}| = \Omega(n \text{polylog}(n))$, this is a constant factor approximation with respect to the cost of the planted cut. Comment: Full version of the paper at the 46th ACM Symposium on the Theory of Computing (STOC 2014). 32 pages
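The construction in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: it only builds a toy instance and measures the planted cut. The function names (`sample_pie_graph`, `cut_cost`) are hypothetical, and two choices are assumptions made here: the random edges run only between $L$ and $R$, and each such pair is included independently with probability `p_cross`, which is just one simple permutation-invariant distribution.

```python
import random


def sample_pie_graph(n_half, intra_edges, p_cross, seed=0):
    """Build a toy instance G + E_random (illustrative sketch only).

    intra_edges: arbitrary edges of G; each must lie inside L or inside R.
    p_cross: inclusion probability for each (L, R) pair. Independent
             sampling is an assumption; the model allows any
             permutation-invariant distribution.
    """
    rng = random.Random(seed)
    L = list(range(n_half))
    R = list(range(n_half, 2 * n_half))
    left = set(L)

    # G must contain no edge between L and R.
    for u, v in intra_edges:
        if (u in left) != (v in left):
            raise ValueError(f"edge ({u}, {v}) crosses the planted cut")

    # Relabeling vertices within L or within R leaves this distribution
    # unchanged, so it is permutation-invariant.
    e_random = {(u, v) for u in L for v in R if rng.random() < p_cross}

    edges = {tuple(sorted(e)) for e in intra_edges} | e_random
    return L, R, edges, e_random


def cut_cost(edges, side):
    """Count the edges with exactly one endpoint in `side`."""
    side = set(side)
    return sum((u in side) != (v in side) for u, v in edges)
```

In such an instance the planted cut $(L, R)$ is crossed only by the random edges, so `cut_cost(edges, L)` equals `len(e_random)`. This is the quantity that the $O(|E_{random}|) + n \text{polylog}(n)$ guarantee is compared against.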

    Monotone Maps, Sphericity and Bounded Second Eigenvalue

    We consider monotone embeddings of a finite metric space into a low-dimensional normed space. That is, embeddings that respect the order among the distances in the original space. Our main interest is in embeddings into Euclidean spaces. We observe that any metric on $n$ points can be embedded into $l_2^n$, while (in a sense to be made precise later) for almost every $n$-point metric space, every monotone map must be into a space of dimension $\Omega(n)$. It becomes natural, then, to seek explicit constructions of metric spaces that cannot be monotonically embedded into spaces of sublinear dimension. To this end, we employ known results on sphericity of graphs, which suggest one example of such a metric space: that defined by a complete bipartite graph. We prove that a $\delta n$-regular graph of order $n$ with bounded diameter has sphericity $\Omega(n/(\lambda_2+1))$, where $\lambda_2$ is the second largest eigenvalue of the adjacency matrix of the graph, and $0 < \delta \leq \frac{1}{2}$ is constant. We also show that while random graphs have linear sphericity, there are quasi-random graphs of logarithmic sphericity. For the above bound to be linear, $\lambda_2$ must be constant. We show that if the second eigenvalue of an $n/2$-regular graph is bounded by a constant, then the graph is close to being complete bipartite. Namely, its adjacency matrix differs from that of a complete bipartite graph in only $o(n^2)$ entries. Furthermore, for any $0 < \delta < \frac{1}{2}$ and $\lambda_2$, there are only finitely many $\delta n$-regular graphs with second eigenvalue at most $\lambda_2$.
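The notion of a monotone map described in this abstract can be made concrete with a brute-force check. This is a minimal sketch, not code from the paper. The function name `is_monotone` is hypothetical, and the handling of ties is an assumption: the check requires only that strictly smaller original distances map to strictly smaller Euclidean distances.

```python
import math
from itertools import combinations


def is_monotone(dist, points):
    """Check whether an embedding preserves the order among distances.

    dist:   symmetric n x n matrix of distances in the original metric.
    points: list of n coordinate tuples (the Euclidean images).
    Assumption: only strict inequalities are checked, so ties in the
    original metric impose no constraint.
    """
    pairs = list(combinations(range(len(points)), 2))
    # Euclidean distance between the images of each pair of points.
    emb = {(i, j): math.dist(points[i], points[j]) for i, j in pairs}
    for a, b in pairs:
        for c, d in pairs:
            if dist[a][b] < dist[c][d] and emb[(a, b)] >= emb[(c, d)]:
                return False
    return True
```

For example, take the three-point path metric with distances 1, 1 and 2. Placing the points at 0, 1 and 2 on a line passes the check, while placing them at 0, 1 and 0.5 fails, because the farthest pair in the original metric is no longer the farthest pair in the image.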

    On the practically interesting instances of MAXCUT

    The complexity of a computational problem is traditionally quantified based on the hardness of its worst case. This approach has many advantages and has led to a deep and beautiful theory. However, from the practical perspective, this leaves much to be desired. In application areas, practically interesting instances very often occupy just a tiny part of an algorithm's space of instances, and the vast majority of instances are simply irrelevant. Addressing these issues is a major challenge for theoretical computer science which may make theory more relevant to the practice of computer science. Following Bilu and Linial, we apply this perspective to MAXCUT, viewed as a clustering problem. Using a variety of techniques, we investigate practically interesting instances of this problem. Specifically, we show how to solve in polynomial time distinguished, metric, expanding and dense instances of MAXCUT under mild stability assumptions. In particular, $(1+\epsilon)$-stability (which is optimal) suffices for metric and dense MAXCUT. We also show how to solve in polynomial time $\Omega(\sqrt{n})$-stable instances of MAXCUT, substantially improving the best previously known result.

    The design of transcription-factor binding sites is affected by combinatorial regulation

    BACKGROUND: Transcription factors regulate gene expression by binding to specific cis-regulatory elements in gene promoters. Although DNA sequences that serve as transcription-factor binding sites have been characterized and associated with the regulation of numerous genes, the principles that govern the design and evolution of such sites are poorly understood. RESULTS: Using the comprehensive mapping of binding-site locations available in Saccharomyces cerevisiae, we examined possible factors that may have an impact on binding-site design. We found that binding sites tend to be shorter and fuzzier when they appear in promoter regions that bind multiple transcription factors. We further found that essential genes bind relatively fewer transcription factors, as do divergent promoters. We provide evidence that novel binding sites tend to appear in specific promoters that are already associated with multiple sites. CONCLUSION: Two principal models may account for the observed correlations. First, it may be that the interaction between multiple factors compensates for the decreased specificity of each specific binding sequence. In such a scenario, binding-site fuzziness is a consequence of the presence of multiple binding sites. Second, binding sites may tend to appear in promoter regions that are subject to low selective pressure, which also allows for fuzzier motifs. The latter possibility may account for the relatively low number of binding sites found in promoters of essential genes and in divergent promoters

    Conservation of Expression and Sequence of Metabolic Genes Is Reflected by Activity Across Metabolic States

    Variation in gene expression levels on a genomic scale has been detected among different strains, among closely related species, and within populations of genetically identical cells. What are the driving forces that lead to expression divergence in some genes and conserved expression in others? Here we employ flux balance analysis to address this question for metabolic genes. We consider the genome-scale metabolic model of Saccharomyces cerevisiae, and its entire space of optimal and near-optimal flux distributions. We show that this space reveals underlying evolutionary constraints on expression regulation, as well as on the conservation of the underlying gene sequences. Genes that have a high range of optimal flux levels tend to display divergent expression levels among different yeast strains and species. This suggests that gene regulation has diverged in those parts of the metabolic network that are less constrained. In addition, we show that genes that are active in a large fraction of the space of optimal solutions tend to have conserved sequences. This supports the possibility that there is less selective pressure to maintain genes that are relevant for only a small number of metabolic states